{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "possible-limitation",
   "metadata": {},
   "source": [
    "# MindSpore实战一，MNIST手写体识别"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "instrumental-governor",
   "metadata": {},
   "source": [
    "数据集介绍：\n",
    "* MNIST数据集来自美国国家标准与技术研究所，National Institute of Standards and Technology(NIST),数据集由来自250个不同人手写的数字构成，其中50%是高中学生，50%来自人口普查局（the Census Bureau）的工作人员。\n",
    "\n",
    "* 训练集：60000，测试集：10000\n",
    "\n",
    "* MNIST数据集可在 http://yann.lecun.com/exdb/mnist/ 获取。\n",
    "\n",
    "<img src=\"image/01.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "speaking-theorem",
   "metadata": {},
   "source": [
    "本实验使用MindSpore深度学习框架，进行网络搭建、数据处理、网络训练和测试，完成MNIST手写体识别任务。\n",
    "\n",
    "实验流程：\n",
    "\n",
    "<img src=\"image/02.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "colonial-litigation",
   "metadata": {},
   "source": [
    "## 环境准备\n",
    "* MindSpore模块主要用于本次实验卷积神经网络的构建，包括很多子模块。\n",
    "    * mindspore.dataset：包括MNIST数据集的载入与处理，也可以自定义数据集。\n",
    "\n",
    "    * mindspore.common：包中会有诸如type形态转变、权重初始化等的常规工具。\n",
    "\n",
    "    * mindspore.nn：主要包括网络可能涉及到的各类网络层，诸如卷积层、池化层、全连接层，也包括损失函数，激活函数等。\n",
    "    \n",
    "    * Model：承载网络结构，并能够调用优化器、损失函数、评价指标。\n",
    "\n",
    "\n",
    "本实验需要以下第三方库：\n",
    "1. MindSpore 1.1.1\n",
    "2. Numpy 1.17.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "favorite-gregory",
   "metadata": {},
   "outputs": [],
   "source": [
    "# mindspore.dataset\n",
    "import mindspore.dataset as ds # 数据集的载入\n",
    "import mindspore.dataset.transforms.c_transforms as C # 常用转化算子\n",
    "import mindspore.dataset.vision.c_transforms as CV # 图像转化算子\n",
    "\n",
    "# mindspore.common\n",
    "from mindspore.common import dtype as mstype # 数据形态转换\n",
    "from mindspore.common.initializer import Normal # 参数初始化\n",
    "\n",
    "# mindspore.nn\n",
    "import mindspore.nn as nn # 各类网络层都在nn里面\n",
    "from mindspore.nn.metrics import Accuracy # 测试模型用\n",
    "\n",
    "\n",
    "from mindspore import Model # 承载网络结构\n",
    "\n",
    "from mindspore.train.callback import LossMonitor\n",
    "\n",
    "# os模块处理数据路径用\n",
    "import os\n",
    "\n",
    "# numpy\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bound-revision",
   "metadata": {},
   "source": [
    "## 数据处理"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "banned-department",
   "metadata": {},
   "source": [
    "定义数据预处理函数。\n",
    "\n",
    "函数功能包括：\n",
    "1. 加载数据集\n",
    "1. 打乱数据集\n",
    "1. 图像特征处理（标准化、通道转换等）\n",
    "3. 批量输出数据\n",
    "4. 重复"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "herbal-nepal",
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_dataset(data_path, batch_size=32):\n",
    "    \"\"\" \n",
    "    数据预处理与批量输出的函数\n",
    "    \n",
    "    Args:\n",
    "        data_path: 数据路径\n",
    "        batch_size: 批量大小\n",
    "    \"\"\"\n",
    "    \n",
    "    # 定义数据集\n",
    "    data = ds.MnistDataset(data_path)\n",
    "    \n",
    "    # 打乱数据集\n",
    "    data = data.shuffle(buffer_size=10000)\n",
    "    \n",
    "    # 数据标准化参数\n",
    "    # MNIST数据集的 mean = 33.3285，std = 78.5655\n",
    "    mean, std = 33.3285, 78.5655 \n",
    "\n",
    "    # 定义算子\n",
    "    nml_op = lambda x : np.float32((x-mean)/std) # 数据标准化，image = (image-mean)/std\n",
    "    hwc2chw_op = CV.HWC2CHW() # 通道前移（为配适网络，CHW的格式可最佳发挥昇腾芯片算力）\n",
    "    type_cast_op = C.TypeCast(mstype.int32) # 原始数据的标签是unint，计算损失需要int\n",
    "\n",
    "    # 算子运算\n",
    "    data = data.map(operations=type_cast_op, input_columns='label')\n",
    "    data = data.map(operations=nml_op, input_columns='image')\n",
    "    data = data.map(operations=hwc2chw_op, input_columns='image')\n",
    "\n",
    "    # 批处理\n",
    "    data = data.batch(batch_size)\n",
    "    \n",
    "    # 重复\n",
    "    data = data.repeat(1)\n",
    "\n",
    "    return data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "meaningful-matrix",
   "metadata": {},
   "source": [
    "## 网络定义\n",
    "### 参考LeNet网络结构，构建网络\n",
    "LeNet-5出自论文《Gradient-Based Learning Applied to Document Recognition》，原本是一种用于手写体字符识别的非常高效的卷积神经网络，包含了深度学习的基本模块：卷积层，池化层，全连接层。\n",
    "\n",
    "本实验将参考LeNet论文，建立以下网络：\n",
    "<img src=\"image/03.png\">\n",
    "\n",
    "1.\tINPUT（输入层） ：输入28∗28的图片。\n",
    "2.\tC1（卷积层）：选取6个5∗5卷积核(不包含偏置)，得到6个特征图，每个特征图的一个边为28−5+1=24。\n",
    "3.\tS2（池化层）：池化层是一个下采样层，输出12∗12∗6的特征图。\n",
    "4.\tC3（卷积层）：选取16个大小为5∗5卷积核，得到特征图大小为8∗8∗16。\n",
    "5.\tS4（池化层）：窗口大小为2∗2，输出4∗4∗16的特征图。\n",
    "6.\tF5（全连接层）：120个神经元。\n",
    "7.\tF6（全连接层）：84个神经元。\n",
    "8.\tOUTPUT（输出层）：10个神经元，10分类问题。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "outdoor-september",
   "metadata": {},
   "outputs": [],
   "source": [
    "class LeNet5(nn.Cell):\n",
    "    \n",
    "    # 定义算子\n",
    "    def __init__(self, num_class=10, num_channel=1):\n",
    "        super(LeNet5, self).__init__()\n",
    "        # 卷积层\n",
    "        self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')\n",
    "        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')\n",
    "        \n",
    "        # 全连接层\n",
    "        self.fc1 = nn.Dense(4 * 4 * 16, 120, weight_init=Normal(0.02))\n",
    "        self.fc2 = nn.Dense(120, 84, weight_init=Normal(0.02))\n",
    "        self.fc3 = nn.Dense(84, num_class, weight_init=Normal(0.02))\n",
    "        \n",
    "        # 激活函数\n",
    "        self.relu = nn.ReLU()\n",
    "        \n",
    "        # 最大池化成\n",
    "        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n",
    "        \n",
    "        # 网络展开\n",
    "        self.flatten = nn.Flatten()\n",
    "        \n",
    "    # 建构网络\n",
    "    def construct(self, x):\n",
    "        x = self.conv1(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.max_pool2d(x)\n",
    "        x = self.conv2(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.max_pool2d(x)\n",
    "        x = self.flatten(x)\n",
    "        x = self.fc1(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.fc2(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.fc3(x)\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "northern-northern",
   "metadata": {},
   "source": [
    "## 模型训练"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "together-homework",
   "metadata": {},
   "source": [
    "载入数据集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "smaller-association",
   "metadata": {},
   "outputs": [],
   "source": [
    "train_path = os.path.join('data','train') # 训练集路径\n",
    "train_data = create_dataset(train_path) # 定义训练数据集\n",
    "\n",
    "test_path = os.path.join('data','test') # 测试集路径\n",
    "test_data = create_dataset(test_path) # 定义测试数据集"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "pleased-rough",
   "metadata": {},
   "source": [
    "定义网络、损失函数、优化器、模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "stupid-nurse",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 网络\n",
    "net = LeNet5()\n",
    "\n",
    "# 损失函数\n",
    "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n",
    "\n",
    "# 优化器\n",
    "lr = 0.01\n",
    "momentum = 0.9\n",
    "net_opt = nn.Momentum(net.trainable_params(), lr, momentum)\n",
    "\n",
    "loss_cb = LossMonitor(per_print_times=train_data.get_dataset_size())\n",
    "\n",
    "# 模型\n",
    "model = Model(net, net_loss, net_opt, metrics={'accuracy': Accuracy()})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "convinced-exposure",
   "metadata": {},
   "source": [
    "训练模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "opening-television",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[WARNING] ME(3552:1084,MainProcess):2022-01-09-16:23:01.416.571 [mindspore\\train\\model.py:500] The CPU cannot support dataset sink mode currently.So the training process will be performed with dataset not sink.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch: 1 step: 1875, loss is 0.14849165\n",
      "epoch: 2 step: 1875, loss is 0.03422373\n",
      "epoch: 3 step: 1875, loss is 0.013453491\n"
     ]
    }
   ],
   "source": [
    "model.train(3, train_data, callbacks=[loss_cb]) # 训练3个epoch"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "seventh-syria",
   "metadata": {},
   "source": [
    "## 模型评估\n",
    "查看模型在测试集的准确率"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "removable-diary",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[WARNING] ME(7428:8284,MainProcess):2021-12-28-14:13:50.563.74 [mindspore\\train\\model.py:893] CPU cannot support dataset sink mode currently.So the evaluating process will be performed with dataset non-sink mode.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'accuracy': 0.9819}"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.eval(test_data) # 测试网络"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adaptive-compiler",
   "metadata": {},
   "source": [
    "## 效果展示"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "chronic-fraction",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True: [4], Predicted: 4\n",
      "True: [1], Predicted: 1\n",
      "True: [2], Predicted: 2\n",
      "True: [0], Predicted: 0\n",
      "True: [8], Predicted: 8\n",
      "True: [7], Predicted: 7\n",
      "True: [3], Predicted: 3\n",
      "True: [0], Predicted: 0\n",
      "True: [5], Predicted: 5\n",
      "True: [1], Predicted: 1\n"
     ]
    }
   ],
   "source": [
    "data_path=os.path.join('data', 'test')\n",
    "\n",
    "ds_test_demo = create_dataset(test_path, batch_size=1)\n",
    "\n",
    "for i, dic in enumerate(ds_test_demo.create_dict_iterator()):\n",
    "    input_img = dic['image']\n",
    "    output = model.predict(input_img)\n",
    "    predict = np.argmax(output.asnumpy(),axis=1)[0]\n",
    "    if i>9:\n",
    "        break\n",
    "    print('True: %s, Predicted: %s'%(dic['label'], predict))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "further-syntax",
   "metadata": {},
   "source": [
    "## 思考\n",
    "1. 请描述MindSpore的基础数据处理流程。\n",
    "    * 答：数据加载 > shuffle > map > batch > repeat\n",
    "2. 定义网络时需要继承哪一个基类？\n",
    "    * 答：mindspore.nn.Cell\n",
    "3. 定义网络时有哪些必须编写哪两个函数？\n",
    "    * 答：\\_\\_init__()，construct()。\n",
    "4. 思考3中提到的两个函数有什么用途？\n",
    "    * 答：一般会在\\_\\_init__()中定义算子，然后在construct()中定义网络结构。\\_\\_init__()中的语句由Python解析执行；construct()中的语句由MindSpore接管，有语法限制；"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "neither-revelation",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "MindSpore_1.5.0",
   "language": "python",
   "name": "mindspore1.5"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
